Action Elimination


Learn What Not to Learn: Action Elimination with Deep Reinforcement Learning

Zahavy, Tom, Haroush, Matan, Merlis, Nadav, Mankowitz, Daniel J., Mannor, Shie

Neural Information Processing Systems

Learning how to act when there are many available actions in each state is a challenging task for Reinforcement Learning (RL) agents, especially when many of the actions are redundant or irrelevant. In such cases, it is sometimes easier to learn which actions not to take. In this work, we propose the Action-Elimination Deep Q-Network (AE-DQN) architecture that combines a Deep RL algorithm with an Action Elimination Network (AEN) that eliminates sub-optimal actions. The AEN is trained to predict invalid actions, supervised by an external elimination signal provided by the environment. Simulations demonstrate a considerable speedup and added robustness over vanilla DQN in text-based games with over a thousand discrete actions.
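In code, the elimination step amounts to masking the Q-values of actions the AEN flags as invalid before acting greedily. The following minimal Python sketch illustrates that mechanic with stand-in networks; the threshold, the network stubs, and the action count are assumptions for illustration, not the authors' implementation.

import numpy as np

rng = np.random.default_rng(0)
N_ACTIONS = 1000  # hypothetical large discrete action space

def q_values(state):
    # Stand-in for a trained DQN: one Q-value per action.
    return rng.standard_normal(N_ACTIONS)

def elimination_probs(state):
    # Stand-in for the Action Elimination Network (AEN): predicted
    # probability that each action is invalid, learned from the
    # environment's elimination signal.
    return rng.uniform(size=N_ACTIONS)

def act(state, threshold=0.5):
    q = q_values(state)
    # Keep only actions the AEN considers admissible, then act
    # greedily over the surviving set.
    admissible = elimination_probs(state) < threshold
    q[~admissible] = -np.inf
    return int(np.argmax(q))

print(act(state=None))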



Cascading Bandits With Feedback

Prakash, R Sri, Karamchandani, Nikhil, Moharir, Sharayu

arXiv.org Artificial Intelligence

Motivated by the challenges of edge inference, we study a variant of the cascade bandit model in which each arm corresponds to an inference model with an associated accuracy and error probability. We analyse four decision-making policies: Explore-then-Commit, Action Elimination, Lower Confidence Bound (LCB), and Thompson Sampling. For each, we provide sharp theoretical regret guarantees. Unlike in classical bandit settings, Explore-then-Commit and Action Elimination incur suboptimal regret because they commit to a fixed ordering after the exploration phase, limiting their ability to adapt. In contrast, LCB and Thompson Sampling continuously update their decisions based on observed feedback, achieving constant O(1) regret. Simulations corroborate these theoretical findings, highlighting the crucial role of adaptivity for efficient edge inference under uncertainty.
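As a concrete illustration of the adaptive policies the abstract credits with constant regret, the sketch below implements a generic lower-confidence-bound rule for picking the inference model with the smallest optimistically estimated error probability. The error rates, horizon, and bonus term are invented for illustration and are not the paper's exact setting.

import math, random

random.seed(0)
true_err = [0.30, 0.10, 0.20]   # hypothetical per-model error probabilities
counts = [0] * 3
err_sum = [0.0] * 3

for t in range(1, 5001):
    # LCB on error: an optimistic (low) estimate of each model's error rate;
    # unpulled models get -inf so they are tried first.
    lcb = [
        (err_sum[i] / counts[i] - math.sqrt(2 * math.log(t) / counts[i]))
        if counts[i] else -float("inf")
        for i in range(3)
    ]
    arm = min(range(3), key=lambda i: lcb[i])
    error = 1.0 if random.random() < true_err[arm] else 0.0  # observed feedback
    counts[arm] += 1
    err_sum[arm] += error

print(counts)  # the lowest-error model should dominate the pulls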


Reviews: Learn What Not to Learn: Action Elimination with Deep Reinforcement Learning

Neural Information Processing Systems

This paper addresses the challenge of environments with a discrete but large number of actions by eliminating actions that are never taken in a particular state. To do so, the paper proposes AE-DQN, which augments DQN with a contextual multi-armed bandit to identify actions that should be eliminated. Evaluation on the text-based game Zork shows promising results, as AE-DQN outperforms baseline DQN on several examples. The idea of eliminating actions that are never taken in a given state is a sound one. The paper is clear and well written.
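The contextual-bandit elimination the review refers to can be sketched as a per-action linear model that predicts the elimination signal from state features and discards an action only when the lower confidence bound on that prediction is already above a threshold. The feature dimension, confidence bonus, and threshold below are assumptions for illustration, not the paper's exact construction.

import numpy as np

D, BETA, THRESH = 8, 1.0, 0.5  # assumed feature dim, bonus scale, cutoff

class EliminationModel:
    def __init__(self):
        self.A = np.eye(D)    # regularized Gram matrix of observed features
        self.b = np.zeros(D)  # feature sums weighted by the elimination signal

    def update(self, x, eliminated):
        self.A += np.outer(x, x)
        self.b += float(eliminated) * x

    def surely_invalid(self, x):
        A_inv = np.linalg.inv(self.A)
        pred = x @ A_inv @ self.b                # ridge estimate of the signal
        width = BETA * np.sqrt(x @ A_inv @ x)    # confidence width
        return pred - width > THRESH             # eliminate only with high confidence

rng = np.random.default_rng(1)
model = EliminationModel()
for _ in range(500):
    x = rng.standard_normal(D)
    model.update(x, eliminated=x[0] > 0)  # toy elimination signal
print(model.surely_invalid(np.array([3.0] + [0.0] * (D - 1))))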


Learn What Not to Learn: Action Elimination with Deep Reinforcement Learning

Zahavy, Tom, Haroush, Matan, Merlis, Nadav, Mankowitz, Daniel J., Mannor, Shie

arXiv.org Machine Learning

Learning how to act when there are many available actions in each state is a challenging task for Reinforcement Learning (RL) agents, especially when many of the actions are redundant or irrelevant. In such cases, it is sometimes easier to learn which actions not to take. In this work, we propose the Action-Elimination Deep Q-Network (AE-DQN) architecture that combines a Deep RL algorithm with an Action Elimination Network (AEN) that eliminates sub-optimal actions. The AEN is trained to predict invalid actions, supervised by an external elimination signal provided by the environment. Simulations demonstrate a considerable speedup and added robustness over vanilla DQN in text-based games with over a thousand discrete actions.


Reformulating Planning Problems: A Theoretical Point of View

Chrpa, Lukáš, McCluskey, Thomas Leo, Osborne, Hugh (University of Huddersfield)

AAAI Conferences

Automated planning is a well-studied research topic thanks to its wide range of real-world applications. Despite significant progress in this area, many planning problems remain hard and challenging. Some techniques, such as learning macro-operators, improve the planning process by reformulating the (original) planning problem. While many encouraging practical results have been derived from such reformulation methods, little attention has been paid to theoretical properties of reformulation such as soundness, completeness, and algorithmic complexity. In this paper we build a theoretical framework describing reformulation schemes such as action elimination or creating macro-actions. Using this framework, we show that finding entanglements (relationships useful for action elimination) is as hard as planning itself. Moreover, we design a tractable algorithm for checking under what conditions it is safe to reformulate a problem by removing primitive operators (assembled into a macro-operator).
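For a concrete sense of the macro-operator reformulation the abstract describes, the toy Python sketch below assembles two STRIPS-style operators (precondition, add, and delete sets) into a single macro-operator; the blocks-world-flavoured operators are invented for illustration and are not from the paper.

def make_macro(o1, o2):
    pre1, add1, del1 = o1
    pre2, add2, del2 = o2
    pre = pre1 | (pre2 - add1)     # o2's preconditions that o1 does not supply
    add = (add1 - del2) | add2     # o1's effects surviving o2's deletes, plus o2's
    delete = (del1 - add2) | del2  # deletes not re-added by o2
    return pre, add, delete

pick = ({"handempty", "clear(a)"}, {"holding(a)"}, {"handempty", "clear(a)"})
drop = ({"holding(a)"}, {"handempty", "on(a,b)"}, {"holding(a)"})
print(make_macro(pick, drop))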